An ultrasound-based radiomic approach to predict the nodal status in clinically negative breast cancer patients

In breast cancer patients, accurate detection of axillary lymph node metastasis is essential for reducing the probability of distant metastasis. For patients who test negative at both clinical and instrumental examination, the nodal status is commonly evaluated by sentinel lymph-node biopsy, a time-consuming and expensive intraoperative procedure for assessing the sentinel lymph-node (SLN) status. The aim of this study was to predict the nodal status of 142 clinically negative breast cancer patients by means of both clinical and radiomic features extracted from primary breast tumor ultrasound images acquired at diagnosis. First, different regions of interest (ROIs) were segmented and a radiomic analysis was performed on each ROI. Then, clinical and radiomic features were evaluated separately by developing two different machine learning models based on a support vector machine (SVM) classifier. Finally, their predictive power was estimated jointly by means of a soft voting technique. The experimental results showed that the model combining clinical and radiomic features performed best, achieving an AUC of 88.6%, an accuracy of 82.1%, a sensitivity of 100% and a specificity of 78.2%. The proposed model represents a promising non-invasive procedure for SLN status prediction in clinically negative patients.

www.nature.com/scientificreports/

paresthesia, lymphedema, and hematoma [3][4][5][6]. Furthermore, a nuclear medicine specialist affiliated with a nuclear medicine department, within the hospital or connected to it, is needed. Therefore, the development of a non-invasive preoperative assessment of the SLN status is attracting growing attention 7. So far, several state-of-the-art studies have proposed reliable alternatives to the SLNB for nodal status prediction in breast cancer patients. In particular, more recent works involved not only clinical factors, but also radiomic features extracted from primary breast tumor images acquired at diagnosis with different imaging techniques [8][9][10]. Yang et al. 8 developed a nomogram which combined clinical information with textural and shape features extracted from ROIs identified on presurgical mammography images. The approach proposed by Santucci et al. 9 aimed to predict the nodal status by combining histological features with radiomic features computed on ROIs extracted from 3 Tesla post-contrast magnetic resonance images. Liu et al. 10 evaluated the predictive power of radiomic features extracted from ROIs identified on dynamic contrast-enhanced magnetic resonance images.
Other research works performed a radiomic analysis on ROIs extracted from primary breast tumor ultrasound images acquired at diagnosis [11][12][13][14]. However, to the best of our knowledge, no study has proposed an ultrasound radiomic-based model for SLN status prediction in clinically negative breast cancer patients, that is, patients with early-stage breast cancer whose nodal positivity is difficult to diagnose.
As a matter of fact, there are few models designed for clinically negative patients, and these are based on clinical features only [15][16][17][18]. In our previous work 17, we used clinical and histopathological features to implement machine learning predictive models. Thus, to improve our previous results, in this study we developed a preoperative tool for SLN metastatic status prediction in clinically negative patients, analyzing radiomic features extracted from primary tumor ultrasound images. Indeed, ultrasound is the least expensive and least invasive technique compared with other diagnostic tools, such as mammography, magnetic resonance and contrast-enhanced devices. In particular, after a radiomic analysis performed on different ROIs by means of four gray-level texture matrices, we compared different approaches which evaluated clinical and radiomic features, both individually and combined, to identify a model suitable for replacing the OSNA procedure without compromising diagnostic accuracy.

Results
Data description and statistical analysis results. This retrospective study was approved by the Scientific Board of the Istituto Tumori "Giovanni Paolo II" of Bari, Italy, and only patients who gave consent to the use of their data were considered. In particular, female patients with a first breast cancer diagnosis in the period 2017-2020 were recruited. The eligibility criteria were (a) negative results at both clinical and instrumental examination, (b) having undergone the one-step nucleic acid amplification (OSNA) procedure, (c) known axillary lymph node (ALN) metastatic status and (d) availability of primary tumor ultrasound images acquired at diagnosis. Finally, 142 patients, of whom 115 had negative ALN metastatic status and 27 had positive ALN metastatic status, were included in this study. Notably, the dataset imbalance was due to the low incidence of axillary nodal metastasis in clinically negative patients.
Thus, a set of 12 clinical features was obtained and the few missing data (see Table 1) were estimated by means of a proximity algorithm. For each patient with at least one missing clinical feature, the proximity technique replaces her missing data with the data of the patient without missing features whose values have the minimum Euclidean distance from the patient under consideration 19. Due to the small sample size and with the aim of improving the accuracy of the estimated data, the data imputation procedure was implemented on the whole dataset. Moreover, potential missing features of new patients will be estimated by comparing each incomplete observation with the 142 patients belonging to the dataset employed in this study.
An overview of the sample properties is provided in Table 1. Features that were statistically significant in our analysis (p-value less than 0.05), namely age, diameter, multifocality and angioinvasion, are highlighted. None of the other variables showed a statistically significant association with the ALN metastatic status. Nevertheless, since in our preliminary analysis the statistically significant features alone were not sufficiently discriminant for outcome prediction, all the clinical features were included in the classification model. Interim results are not reported so as not to burden the discussion.
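The proximity-based imputation described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `impute_from_nearest_complete`, the toy records and the choice of computing the distance only over the features observed for the incomplete patient are assumptions of this sketch. In practice the features would probably need to be put on a comparable scale before computing distances, a detail the text does not specify.

```python
import math

def impute_from_nearest_complete(patients):
    """Fill missing values (None) with those of the nearest complete patient.

    The Euclidean distance is computed only over the features that the
    incomplete patient actually has (an assumption of this sketch).
    """
    complete = [p for p in patients if None not in p]
    if not complete:
        raise ValueError("at least one complete record is required")
    imputed = []
    for p in patients:
        if None not in p:
            imputed.append(list(p))
            continue
        observed = [i for i, v in enumerate(p) if v is not None]
        # Donor = complete patient at minimum Euclidean distance.
        donor = min(
            complete,
            key=lambda c: math.sqrt(sum((p[i] - c[i]) ** 2 for i in observed)),
        )
        imputed.append([donor[i] if v is None else v for i, v in enumerate(p)])
    return imputed

# Toy records [age, diameter, ki67]; the values are invented for illustration.
records = [
    [55.0, 12.0, 20.0],
    [70.0, 25.0, 35.0],
    [57.0, None, 18.0],
]
filled = impute_from_nearest_complete(records)
print(filled[2])
```

Here the third record borrows its missing diameter from the first record, which is the closest complete one on the two observed features.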
Classification performances. Besides the clinical set, four radiomic sets were defined for each patient.
Specifically, the radiomic analysis was performed on four different ROIs, namely the original ROI, intra-tumoral ROI, peritumoral ROI and combined ROI, extracted from the primary breast tumor ultrasound image acquired at diagnosis (see "Method"). Thus, a total of eleven learning models were developed to evaluate the predictive power of clinical and radiomic features, first individually and then jointly.
For each developed classification model, the hold-out training set consisted of 80% of the input sample, that is, 114 randomly selected patients, of whom 92 had negative ALN metastatic status and 22 had positive ALN metastatic status. Consequently, the hold-out test set, made up of the remaining 20% of the sample, included 28 patients, of whom 23 had negative ALN metastatic status and 5 had positive ALN metastatic status. Thus, the same percentage of positive patients was selected for both sets by splitting the sample into training and test sets according to random stratified sampling. To verify the robustness of the split, in a supplementary table (Table S1) we reported the distributions of both sub-sets with respect to all clinical features involved in this study. The statistical analysis returned a p-value greater than 0.05 for ten of the twelve features, indicating that the training and test sets are largely homogeneous. Classification performances achieved by all models on the hold-out training set are summarized in a supplementary table (Table S2) so as not to burden the discussion.
Meanwhile, Table 2 summarizes the classification performances achieved by all models on the hold-out test set. For each metric we specified the 95% confidence interval computed with the R library pROC. Each radiomic-based model is denoted by the name of the ROI from which the features were extracted (e.g., Radiomic original refers to the model obtained considering radiomic features extracted from the original ROI). Moreover, while the Radiomic comb model was designed by extracting features from the combined ROIs, the last reported radiomic-based model, Radiomic intra + peri, refers to the model developed by merging the intra-tumoral and peritumoral datasets.
For all radiomic-based models, the reported performances refer to the classification models trained on the sub-set of features selected at least 50% of the time over the leave-one-out cross-validation rounds on the training set. As a matter of fact, this represents the best trade-off between high performances and low-dimensional datasets: as some examples in Fig. 1 show, the AUC values do not undergo large variations as the dataset dimension decreases, that is, as the feature selection frequency increases. For a fair comparison, since 50% was the greatest frequency common to all models, frequencies greater than 50% were not considered. The same criterion was applied to the development of the soft voting-based models.
The model developed with clinical features only reached an AUC of 73.9%, an accuracy of 82.1%, a sensitivity of 60% and a specificity of 86.9%. Among the radiomic-based models, the best one in terms of AUC was the Radiomic original model, which achieved an AUC of 75.6%, an accuracy of 67.8%, a sensitivity of 80% and a specificity of 65.2%. Furthermore, radiomic features extracted from the original ROI were the most significant even when evaluated in association with the clinical features within the soft voting (SV) approach. In particular, the Clinical/Radiomic original (SV) model reached an AUC of 88.6%, an accuracy of 82.1%, a sensitivity of 100% and a specificity of 78.2%.
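The class counts quoted above follow directly from an 80/20 stratified split of 115 negative and 27 positive patients. The sketch below reproduces that arithmetic; the function `stratified_split`, the seed and the rounding rule for the per-class test size are assumptions of this illustration, since the text does not describe how the split was implemented.

```python
import random

def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split indices so that each class keeps the same test fraction."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(test_fraction * len(idx))  # per-class test size
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)

# 0 = negative ALN status, 1 = positive ALN status (counts from the text).
labels = [0] * 115 + [1] * 27
train_idx, test_idx = stratified_split(labels)
train_pos = sum(labels[i] for i in train_idx)
test_pos = sum(labels[i] for i in test_idx)
print(len(train_idx), train_pos, len(test_idx), test_pos)
```

With these inputs the split contains 114 training patients (22 positive) and 28 test patients (5 positive), matching the figures reported in the text.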
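The frequency criterion can be illustrated with a short sketch: a feature is retained if it was selected in at least 50% of the cross-validation rounds. The function `stable_features` and the feature names are hypothetical and serve only to show the counting step.

```python
from collections import Counter

def stable_features(selected_per_round, min_frequency=0.5):
    """Keep features chosen in at least `min_frequency` of the rounds."""
    n_rounds = len(selected_per_round)
    counts = Counter(f for selected in selected_per_round for f in set(selected))
    return sorted(f for f, c in counts.items() if c / n_rounds >= min_frequency)

# Hypothetical output of the feature selector over four rounds.
rounds = [
    ["glcm_contrast", "glrlm_sre"],
    ["glcm_contrast", "ngtdm_busyness"],
    ["glcm_contrast", "glrlm_sre", "glszm_zp"],
    ["glszm_zp"],
]
kept = stable_features(rounds)
print(kept)
```

Raising `min_frequency` shrinks the retained sub-set, which is the dataset-dimension trade-off discussed above.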
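As a consistency check, the reported sensitivity, specificity and accuracy can be related to confusion-matrix counts on the 28-patient test set (5 positive, 23 negative). The counts below are inferred from the reported percentages and are not given in the text; the function `summarize` is a hypothetical helper.

```python
def summarize(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Counts inferred from the reported percentages (assumption of this sketch).
models = {
    "Clinical": dict(tp=3, fn=2, tn=20, fp=3),
    "Radiomic original": dict(tp=4, fn=1, tn=15, fp=8),
    "Clinical/Radiomic original (SV)": dict(tp=5, fn=0, tn=18, fp=5),
}
for name, counts in models.items():
    sens, spec, acc = summarize(**counts)
    print(f"{name}: sensitivity={sens:.1%}, specificity={spec:.1%}, accuracy={acc:.1%}")
```

Under these inferred counts the values agree with the reported ones up to the last digit (for example, 20/23 is about 0.8696 against a reported 86.9%), which suggests that the reported figures were truncated rather than rounded.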

Discussion
Early and accurate detection of the SLN metastatic status is essential not only to optimally treat breast cancer patients from a surgical point of view, but also to reduce the probability of recurrence and/or distant metastasis. Currently, the guidelines indicate the one-step nucleic acid amplification (OSNA) as the intraoperative method for detecting the SLN metastatic status in clinically negative breast cancer patients. The OSNA procedure achieves high performances; nevertheless, it is a time-consuming and expensive examination which could lead to several side effects 3. Furthermore, since the percentage of clinically negative patients who develop nodal metastasis is approximately 15%, this could be an unnecessarily invasive procedure. Thus, the aim of this work was to devise a model able to predict the SLN metastatic status in clinically negative breast cancer patients, in order to propose a non-invasive alternative to the OSNA procedure. Firstly, 142 clinically negative breast cancer patients who underwent the OSNA procedure in our Institute were recruited. For each patient, a primary tumor ultrasound image acquired at diagnosis and several clinical features, namely age, diameter, grading, histological type, ER, PgR, ki67, HER2/neu, quadrant, multifocality, angioinvasion and invasiveness, were collected. Then, from each ultrasound image, four different frames, namely the original ROI, intra-tumoral ROI, peritumoral ROI and combined ROI, were extracted and their texture was analyzed by means of radiomic features computed on four different gray-level texture matrices (see "Method"). Finally, eleven classification models based on an SVM classifier were developed, and their performances were evaluated on a hold-out test set in terms of AUC, accuracy, sensitivity, and specificity.
Additionally, all the radiomic-based models were trained on the sub-set of features selected by a genetic algorithm within a leave-one-out cross-validation procedure on the training set (see "Method").
The Clinical model reached an AUC, an accuracy, a sensitivity, and a specificity equal to 73.9%, 82.1%, 60% and 86.9%, respectively. Among the radiomic-based models, the best performing in terms of AUC was the model developed with the radiomic features extracted from the original ROI. In particular, the Radiomic original model achieved an AUC of 75.6%, an accuracy of 67.8%, a sensitivity of 80% and a specificity of 65.2%. This model turned out to be the optimal choice also when evaluated in association with the clinical-based model within the soft voting approach. Indeed, the Clinical/Radiomic original (SV) model was the best one, reaching an AUC of 88.6%, an accuracy of 82.1%, a sensitivity of 100%, and a specificity of 78.2%. Comparing these results with those obtained on the intra-tumoral ROI, it can be noticed that the peritumoral region included in the original ROI provides discriminant information for ALN metastatic status prediction. Furthermore, the radiomic model on the original ROI outperforms the Radiomic intra + peri model. These findings may be explained by the fact that the original ROI captures a large zone of peritumoral tissue, i.e., the tissue connecting the tumor and normal tissue, which has been demonstrated to be a site of tumor proliferation and angiogenesis 20. Thus, the related extracted radiomic characteristics might be more highly correlated with the metastatic nodal status to be predicted.
As a matter of fact, a statistical analysis showed that the only statistically significant radiomic-based model was the one that exploited radiomic features extracted from the original ROI. These results are in agreement with research studies relating tumor stiffness evaluated on US elastography to axillary nodal metastasis in early-stage breast cancer patients, where tumor stiffness was computed as the displacement of each pixel relative to the surroundings and converted into a colour display in real time 21. Accordingly, in our future works we will investigate nodal status prediction while varying the thickness of the peritumoral regions within the extracted ROIs.
The proposed model outperforms the ones developed in our previous works with clinical features only. Indeed, in our first work, Fanizzi et al. 16, an AUC of 68%, an accuracy of 50.5%, a sensitivity of 69.8% and a specificity of 45.5% were achieved on an independent test set, considering the following features: diameter, age, histological type, grading, ER, PgR, ki67 and HER2. Comparable results were reached in our subsequent work, Fanizzi et al. 17, implementing a machine learning approach on the same clinical features, with the addition of two histological characteristics, namely multifocality and in-situ component. Our choice of reporting the performances of all investigated models had the purpose of highlighting the fundamental role of radiomic features in improving the performances of clinical-based predictive models. These results are consistent with the role of radiomics discussed in the literature 22. Specifically, the central role of ultrasound in the development of such a promising radiomic model for nodal status prediction increases the importance of this non-invasive and inexpensive diagnostic procedure, especially as the dependence on the operator could be overcome through automatic acquisition systems 23.
Our performances also exceed the results obtained by Dihge et al. 15, who developed different artificial neural network-based models for nodal status prediction in clinically negative patients by means of both clinical and radiological features, reaching a best AUC of 74%.
An overview of the performances achieved by state-of-the-art models designed for clinically negative breast cancer patients is provided in Table 3.
While the above-mentioned approaches involved only clinical features, other approaches were developed considering also features extracted from ultrasound images by different techniques 11-14. Qiu et al. 11 extracted radiomic features from ultrasound images belonging to 196 breast cancer patients and developed a nomogram which achieved an AUC of 75.9% in the validation cohorts. Similarly, Zhou et al. 12 analyzed radiomic features extracted from ultrasound images of 192 breast cancer patients, reaching an AUC of 65% on the test set.
The approach proposed by Sun et al. 13 compares the performances achieved by radiomic-based and deep learning-based models implemented on features extracted from ultrasound images of 479 patients. Specifically, the images the authors used correspond to the intra-tumoral, peritumoral and combined images of our study. On the validation cohorts they obtained the best results by analyzing the combined ROI, reaching an AUC of 83.3% with the radiomic-based model and an AUC of 95% with the deep learning one. As in our analysis, comparing the best results with the ones obtained by the authors on the intra-tumoral ROI confirms the discriminant role of the peritumoral region in ALN status prediction. A further deep learning-based model was developed by Zheng et al. 14. The authors achieved an AUC of 90.2% by combining clinical parameters and deep learning features extracted from ultrasound images belonging to 584 breast cancer patients.
Although our model is not yet suitable for clinical practice due to its low specificity and the small sample size, it represents an improvement on the state of the art. Indeed, even if the last models discussed above reached high performances, our approach is the only radiomic-based model designed for clinically negative breast cancer patients. Furthermore, our model correctly identified all the metastatic sentinel lymph nodes in the test set, despite the greater difficulty in diagnosing nodal positivity in clinically negative patients, that is, patients with early-stage breast cancer. Thus, it is a promising and inexpensive method which could replace the OSNA procedure without compromising patient care.
Encouraged by the better results commonly achieved by state-of-the-art deep learning-based models for ultrasound analysis 13,14, in our future work we will devise a deep learning approach for predicting the SLN metastatic status in clinically negative breast cancer patients, with the aim of improving the specificity starting from both the clinical features and the ultrasound images acquired at diagnosis.

Method
Ultrasound image acquisition and pre-processing. Ultrasound scans were performed by means of a PHILIPS Affiniti 70 ultrasound system with an L12-5 50 linear probe (5-12 MHz, 256 elements, 50 mm, fine pitch). Subsequently, ultrasound images enclosing the neoplastic lesion were retrospectively retrieved from the Picture Archiving and Communication System (PACS), and for each patient an image was selected by a breast radiologist from our Institute with more than 20 years of experience. As depicted in Fig. 2, since in each image the region of interest was delimited by markers during the screening phase, an inpainting technique was implemented to remove and replace these objects. In detail, given an input image and a selected target region, this technique removes the pixels in this region and replaces them using an exemplar-based texture synthesis, namely, the process of generating a new texture image perceptually equal to the input sample 24. In this study, an inpainting technique based on coherence transport was adopted. Once the inpainting domain is defined, coherence transport propagates image values according to a transport equation in a coherent direction, that is, the direction along which a specified degree of coherence among gray-level values is maintained 25.
Table 3. Nodal status prediction in clinically negative breast cancer patients: comparison among the performances of state-of-the-art models.

Subsequently, four different frames containing the ROI were extracted from each cleaned-up image. An example is depicted in Fig. 3. The first image corresponds to the bounding box containing both the intra-tumoral and the peritumoral region. The second image was automatically segmented by implementing an in-house region growing algorithm with the aim of isolating the intra-tumoral region. Starting from a center pixel, this technique iteratively dilates the region of interest by comparing each neighboring candidate pixel with the pixels already in the region by means of a similarity measure, defined as the difference between the gray-level value of the candidate and the mean gray-level value of the region 26. Then, a dilation technique was applied to the intra-tumoral ROI to obtain the fourth image. This procedure expands the input object by a specified measure, without modifying the image contour 27. In this study, a 2 cm dilation was performed, according to our radiologist's directives. Finally, the third image, containing only the peritumoral region, was obtained by subtracting the intra-tumoral ROI from the combined ROI.
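The segmentation steps described above (region growing, dilation and mask subtraction) can be sketched on a toy image as follows. This is an illustrative sketch, not the in-house algorithm: the 4-connectivity, the `tolerance` value, the square structuring element and the dilation radius expressed in pixels are all assumptions, and the conversion from 2 cm to pixels depends on the image resolution, which is not given in the text.

```python
from collections import deque

def region_grow(image, seed, tolerance):
    """Grow a region from `seed`, adding 4-connected neighbours whose
    gray level differs from the current region mean by at most `tolerance`."""
    rows, cols = len(image), len(image[0])
    mask = [[False] * cols for _ in range(rows)]
    r0, c0 = seed
    mask[r0][c0] = True
    total, count = float(image[r0][c0]), 1
    frontier = deque([seed])
    while frontier:
        r, c = frontier.popleft()
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and not mask[nr][nc]:
                if abs(image[nr][nc] - total / count) <= tolerance:
                    mask[nr][nc] = True
                    total += image[nr][nc]
                    count += 1
                    frontier.append((nr, nc))
    return mask

def dilate(mask, radius):
    """Binary dilation with a (2*radius+1) square structuring element."""
    rows, cols = len(mask), len(mask[0])
    out = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if mask[r][c]:
                for nr in range(max(0, r - radius), min(rows, r + radius + 1)):
                    for nc in range(max(0, c - radius), min(cols, c + radius + 1)):
                        out[nr][nc] = True
    return out

def subtract_masks(combined, intra):
    """Peritumoral mask = combined mask minus intra-tumoral mask."""
    return [[c and not i for c, i in zip(cr, ir)] for cr, ir in zip(combined, intra)]

def area(mask):
    return sum(sum(row) for row in mask)

# Toy image: a dark 2x2 "lesion" on a bright background (invented values).
image = [
    [200, 200, 200, 200, 200],
    [200,  40,  45, 200, 200],
    [200,  42,  50, 200, 200],
    [200, 200, 200, 200, 200],
]
intra = region_grow(image, seed=(1, 1), tolerance=15)
combined = dilate(intra, radius=1)
peri = subtract_masks(combined, intra)
print(area(intra), area(combined), area(peri))
```

In this toy case the grown region covers the four dark pixels, the dilated mask adds a one-pixel ring around them, and the peritumoral mask is that ring alone.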
Radiomic feature extraction. From each of the four previously pre-processed ROIs, 134 radiomic features were automatically extracted to describe the image texture. Basically, four different matrices were computed: the gray-level co-occurrence matrix (GLCM), the gray-level run length matrix (GLRLM), the gray-level size zone matrix (GLSZM) and the neighborhood gray-tone difference matrix (NGTDM). The statistical measures extracted from the GLCM characterize the image texture by estimating how often pairs of pixels with specific values and in a specified spatial relationship occur in the image itself 28. The statistical features computed on the GLRLM describe the gray-level runs, a run being a sequence of consecutive pixels with the same gray-level value, whose length is the number of pixels it contains 29. The statistical measures extracted from the GLSZM describe the gray-level zones, that is, the connected areas that share the same gray-level intensity 30. Finally, the statistical features computed on the NGTDM measure the difference between the gray value of each pixel and the average gray value of its neighbors 31. Radiomic analysis was performed by means of Matlab toolboxes [32][33][34][35], setting free parameters to their default values and computing the matrices in the four possible directions (0°, 45°, 90°, 135°).
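To make the GLCM definition concrete, the sketch below counts co-occurring gray-level pairs along the four directions and derives one texture feature (contrast). It is a didactic pure-Python sketch and not the Matlab toolbox used in the study; the symmetric counting, the unit pixel distance, the offset convention and the choice of contrast as the example feature are assumptions.

```python
def glcm(image, offset, levels):
    """Symmetric, normalised gray-level co-occurrence matrix for one offset."""
    dr, dc = offset
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                i, j = image[r][c], image[nr][nc]
                counts[i][j] += 1
                counts[j][i] += 1  # count the pair in both orders
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]

def contrast(p):
    """GLCM contrast: sum over (i, j) of (i - j)^2 * p[i][j]."""
    return sum(((i - j) ** 2) * v for i, row in enumerate(p) for j, v in enumerate(row))

# (row, column) offsets at distance 1 for the four directions.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}

# Small 4-level test image.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p0 = glcm(image, OFFSETS[0], levels=4)
features = {angle: contrast(glcm(image, off, 4)) for angle, off in OFFSETS.items()}
print(features)
```

For the 0° direction, the only off-diagonal pairs are (0,1), (0,2) and (2,3), so the contrast equals 14/24.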

Statistical analysis.
With the aim of evaluating the contribution of each clinical feature on the ALN metastatic status, a preliminary statistical analysis was performed by means of the Mann-Whitney test for variables measured on a continuous scale and the Chi-square test for variables measured on a nominal scale. A feature was considered statistically significant if the performed statistical test returned a p-value less than 0.05. Subsequently, in order to ensure homogeneity between the training and test sets, a further statistical analysis was implemented to compare the distributions of both sub-sets with respect to all clinical features involved in this study. Two distributions were considered statistically different if the statistical test returned a p-value less than 0.05.
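The two tests can be sketched in pure Python as follows. This is only an illustration: the study does not state which software or options were used for these tests, and the 2x2 table without continuity correction, the normal approximation for the Mann-Whitney p-value (without tie correction) and the toy data are assumptions of this sketch.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square test on a 2x2 table (no continuity correction)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

def mann_whitney_u(x, y):
    """U statistic of x versus y with a two-sided normal-approximation p-value."""
    u = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    n1, n2 = len(x), len(y)
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean) / sd
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u, p_value

# Invented contingency table: rows = nodal status, columns = binary feature.
stat, p_nominal = chi_square_2x2([[10, 17], [20, 95]])
# Invented continuous measurements for two groups.
u, p_continuous = mann_whitney_u([48, 52, 55, 61], [58, 63, 66, 70, 72])
print(stat, p_nominal, u, p_continuous)
```

A feature would then be flagged as statistically significant when the returned p-value is below 0.05, as stated above.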
Learning model. By means of the features collected in the previous steps, eleven different learning models were developed and compared to determine the one best able to predict the ALN metastatic status. Initially, clinical and radiomic features were evaluated separately by implementing two different machine learning approaches (see Fig. 4). Afterwards, their predictive power was estimated jointly by adopting a third machine learning approach, namely soft voting (SV).
All models share both a prior dataset splitting step, by which the original sample is divided into a hold-out training set (containing 80% of the sample) and a hold-out test set (containing the remaining 20%), and the classification step, based on the support vector machine (SVM) classifier. The SVM classifier is a supervised machine learning model which identifies, among all the possible hyperplanes which separate the two classes of data points, the one which has the maximum margin, that is, the maximum distance from the nearest data points of both classes, by means of a kernel function. For our study the radial basis function was adopted 36.
Although the clinical-based model consisted only of the data splitting and classification steps, the radiomic-based models required a feature selection stage to reduce the dataset dimension and avoid overfitting. Due to the relatively small size of the sample, the feature selection procedure was performed on the hold-out training set in a leave-one-out cross-validation scheme 37 by means of a genetic algorithm (GA), and different classification models were trained by switching, in turn, the subset of features according to the frequency with which they were selected as important over the cross-validation rounds. GA is an optimization algorithm inspired by genetics. Just as in biology stronger individuals have a higher probability of passing on their genes to their children, GA selects the optimal combination of features by evaluating the probability of different randomly generated populations of being the strongest possible parents [38][39][40].
Finally, the SV approach allowed us to examine the mutual influence of the clinical-based model and each of the radiomic-based models in predicting the classification outcome. SV is a machine learning approach belonging to the class of ensemble methods. Ensemble methods use the collective judgment of multiple classifiers trained on the same data to make a prediction: every individual classifier provides, for each element, a probability of belonging to a class, and the final prediction can be computed in different ways according to the designated technique. Specifically, SV combines individual predictions by averaging the scores attributed by each of the considered models [41][42][43]. All the presented models were evaluated in terms of Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve and other standard metrics such as accuracy, sensitivity and specificity, computed by setting the threshold value equal to the ratio of the number of patients with positive ALN metastatic status to the total number of patients. Additionally, in order to estimate the variability of these metrics, the 95% confidence intervals were computed on 200 bootstrap rounds with the R library pROC.
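A genetic algorithm for feature selection can be sketched as below. This is a generic textbook-style sketch, not the configuration used in the study: the population size, truncation selection, one-point crossover, mutation rate, elitism and the toy fitness function are all assumptions. In the study the fitness would presumably be tied to classification performance on the training data.

```python
import random

def genetic_feature_selection(n_features, fitness, pop_size=20, generations=30,
                              mutation_rate=0.05, seed=0):
    """Evolve boolean feature masks and return the fittest mask found."""
    rng = random.Random(seed)
    population = [[rng.random() < 0.5 for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        scored = sorted(population, key=fitness, reverse=True)
        parents = scored[: pop_size // 2]      # truncation selection
        children = [list(scored[0])]           # elitism: keep the current best
        while len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = mother[:cut] + father[cut:]
            # Flip each gene with probability `mutation_rate`.
            child = [(not g) if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = children
        best = max(population + [best], key=fitness)
    return best

# Toy fitness: reward two hypothetical informative features, penalise size.
INFORMATIVE = {0, 3}

def toy_fitness(mask):
    hits = sum(1 for i, keep in enumerate(mask) if keep and i in INFORMATIVE)
    return hits - 0.1 * sum(mask)

best_mask = genetic_feature_selection(n_features=8, fitness=toy_fitness)
print(best_mask, toy_fitness(best_mask))
```

Running such a selector once per leave-one-out round yields one feature sub-set per round, from which selection frequencies can be computed.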
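The soft voting rule and the prevalence-based threshold can be sketched as follows. The probabilities are invented, the helper names are hypothetical, and two details are assumptions of this sketch: the prevalence is computed on the whole dataset (27 positive patients out of 142) and a score equal to the threshold is assigned to the positive class.

```python
def soft_vote(prob_clinical, prob_radiomic):
    """Average the positive-class probabilities of the two classifiers."""
    return [(a + b) / 2 for a, b in zip(prob_clinical, prob_radiomic)]

def classify(scores, threshold):
    """Label a patient positive when the score reaches the threshold."""
    return [int(s >= threshold) for s in scores]

# Threshold = positive patients over all patients (assumed: whole dataset).
threshold = 27 / 142

# Invented positive-class probabilities for four patients.
clinical = [0.10, 0.30, 0.05, 0.60]
radiomic = [0.20, 0.15, 0.10, 0.40]

scores = soft_vote(clinical, radiomic)
predictions = classify(scores, threshold)
print(scores, predictions)
```

Because the threshold is close to 0.19 rather than 0.5, patients with moderate averaged scores are already flagged as positive, which favours sensitivity over specificity on an imbalanced sample such as this one.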
Ethics approval and consent to participate. The study was conducted according to the guidelines of the Declaration of Helsinki, and approved by the Scientific Board of Istituto Tumori 'Giovanni Paolo II' (Bari, Italy)-Prot. 6629/21. Consent for publication. Informed consent was obtained from all subjects and/or their legal guardian(s).
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.